fix: Fix deathlock during statically registering services#4657
fix: Fix deathlock during statically registering services#4657pavel-jares-bcm wants to merge 10 commits into
Conversation
Signed-off-by: Pavel Jareš <Pavel.Jares@broadcom.com>
Signed-off-by: Pavel Jareš <Pavel.Jares@broadcom.com>
Signed-off-by: Pavel Jareš <Pavel.Jares@broadcom.com>
Signed-off-by: Pavel Jareš <Pavel.Jares@broadcom.com>
Signed-off-by: Pavel Jareš <Pavel.Jares@broadcom.com>
|
Not directly related to your PR, but are we sure that this calculation of adding a fixed +2 in |
I am not sure, but I would say it make sense only for Modulith and the number is incorrect, there should be 3 for apiml (CS, Catalog, GW) and maybe also amount of statically registered services, but I don't consider it as part of this PR. Maybe we can create a issue for verification. |
| void givenStaticDefinition_whenFailRegistration_thenCorrectionAndExpectedNumberOfClientsSendingRenewsDidntChange() { | ||
| var renewCorrection = (ThreadLocal<Integer>) ReflectionTestUtils.getField(apimlInstanceRegistry, "RENEW_CORRECTION"); | ||
| renewCorrection.set(123); | ||
| ReflectionTestUtils.setField(apimlInstanceRegistry, "expectedNumberOfClientsSendingRenews", 5); |
There was a problem hiding this comment.
these lines are repeated, so they can be set globally
Signed-off-by: Pavel Jareš <Pavel.Jares@broadcom.com>
|



Description
When service is statically registered (since #4051) there is one additional lock in the class
ApimlInstanceRegistry(see the synchronization removal). In case of multiple threads it could lead to service freezing on start-up during reading the static definition file.Lock order of registerStatically:
Lock order of register:
Deathlock:
thead1 (register): read.lock()
thead2 (registerStatically): synchronized (lock)
thread1 (register): is waiting for synchronized block to be available (to be done by thread2)
thread2 (registerStatically): is waiting for read.unlock() (to be issued by thread1)
Note: locks are used also by other methods but it should be the issue.
The source is the REST call from any south-bound service during the initialization of static services. The risk is higher when the static definition file or amount of service is bigger.
The additional lock (see the synchronized (lock) in registerStatically) was added because the original method register shouldn't change the property
expectedNumberOfClientsSendingRenews. Using the same lock should prevent exclusivity to update the field. It is not correct solution because of orders of locks.The new solution store information about the change (correction - the oposite operation) in a ThreadLocal, so the locks are not necessary now.
The similar issue was in cancel method as well. But it is not big deal as this method is used very rarelly.
Other changes in the PR becomes from analysis that using reflection is not needed at this version of Eureka (there were very probably changes in the overriden code). We can call the whole method now and it is not necessary to reimplement it via calling private methods via reflection.
Linked to # (issue)
Part of the # (epic)
Type of change
Please delete options that are not relevant.
Checklist:
For more details about how should the code look like read the Contributing guideline